Specializing Word Embeddings for Similarity or Relatedness

Abstract

The paper compares two variants of retrofitting and a joint-learning approach.

Although specializing word embeddings for similarity is sometimes desirable, it may in other cases be detrimental to downstream performance. For example, when classifying documents by topic, we are more interested in related words than in similar ones.

Specializing for similarity is achieved by learning from both a corpus and a thesaurus; specializing for relatedness, by learning from both a corpus and a collection of psychological association norms.

  1. Graph-based retrofitting (see the sketch after this list)
  2. Skip-gram retrofitting
  3. Skip-gram joint-learning approach
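For reference, the graph-based variant can be written down compactly. The sketch below is a minimal NumPy illustration, assuming the common formulation in which each word vector is iteratively pulled toward the mean of its neighbours in a semantic lexicon while staying close to its original value; the lexicon and weighting are illustrative, not taken from the paper.

```python
import numpy as np

def retrofit(vectors, lexicon, iterations=10):
    """vectors: dict word -> np.ndarray; lexicon: dict word -> list of related words."""
    new_vectors = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            neighbours = [n for n in neighbours if n in new_vectors]
            if word not in new_vectors or not neighbours:
                continue
            # With unit weights, each update averages the neighbours' mean
            # with the word's original (pre-retrofitting) embedding.
            neighbour_mean = np.mean([new_vectors[n] for n in neighbours], axis=0)
            new_vectors[word] = 0.5 * (neighbour_mean + vectors[word])
    return new_vectors
```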

Approach

The underlying assumption of the approach is that, during training, word embeddings can be "nudged" in a particular direction by including information from an additional semantic data source.
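Concretely, the additional semantic data source can be turned into extra training contexts. The snippet below is a hypothetical illustration, assuming the thesaurus (for similarity) or association norms (for relatedness) are available as a mapping from a cue word to its related words; the data shown are made up.

```python
def resource_to_contexts(resource):
    """Turn {cue: [related words]} into pseudo-sentences usable as skip-gram contexts."""
    return [[cue] + related for cue, related in resource.items()]

thesaurus = {"happy": ["glad", "joyful", "cheerful"]}          # similarity signal
association_norms = {"doctor": ["nurse", "hospital", "sick"]}  # relatedness signal

similarity_contexts = resource_to_contexts(thesaurus)
relatedness_contexts = resource_to_contexts(association_norms)
```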

The authors propose skip-gram retrofitting, which consists of two training stages. In the first stage, a standard skip-gram model is trained on the corpus. In the second stage, the model is further trained on the additional semantic contexts.
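A minimal sketch of this two-stage procedure using gensim is shown below; `corpus_sentences` and `extra_contexts` are toy placeholders (the latter built as in the previous snippet), and the hyperparameters are illustrative rather than the paper's settings.

```python
from gensim.models import Word2Vec

corpus_sentences = [["the", "doctor", "saw", "the", "patient"]]  # toy corpus
extra_contexts = [["doctor", "nurse", "hospital", "sick"]]       # toy semantic contexts

# Stage 1: train a standard skip-gram model (sg=1) on the corpus.
model = Word2Vec(corpus_sentences, vector_size=100, window=5,
                 min_count=1, sg=1, epochs=5)

# Stage 2: continue training on the additional contexts, nudging the
# embeddings toward similarity or relatedness.
model.build_vocab(extra_contexts, update=True)
model.train(extra_contexts, total_examples=len(extra_contexts), epochs=model.epochs)
```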

Results

The document classification results are not convincing.
